Skip to content

feat: add test-both override for empirical deadlock resolution#1

Closed
claytona500 wants to merge 1 commit into
peteromallet:mainfrom
claytona500:feat/test-both-override
Closed

feat: add test-both override for empirical deadlock resolution#1
claytona500 wants to merge 1 commit into
peteromallet:mainfrom
claytona500:feat/test-both-override

Conversation

@claytona500
Copy link
Copy Markdown

Summary

When the critique loop hits ESCALATE, the current options are add-note, force-proceed, or abort — all of which punt the decision to the human without evidence. This PR adds a test-both override that breaks deadlocks empirically:

  • Invokes a judge agent to evaluate the current plan (approach A) against an alternative approach (approach B) that addresses the unresolved flags
  • The judge renders a structured verdict: approach_a, approach_b, or synthesis
  • The verdict determines the next step: approach_a wins → gate, approach_b/synthesis → integrate with the judge's recommendations

Motivation

Inspired by adversarial convergence patterns where competing approaches are tested empirically rather than debated endlessly. When the same critique concerns recur across iterations and neither force-proceed nor add-note resolves the impasse, test-both gives the orchestrator an evidence-based path forward.

Changes

  • schemas.py — New test-both.json schema with structured approach comparison and verdict enum
  • prompts.py — Judge prompt that evaluates both approaches against unresolved flags
  • workers.py — Mock worker, schema filename mapping, session key for test-both step
  • _core.py — Default agent routing (claude) for test-both
  • cli.py_override_test_both handler with full state machine integration; updated infer_next_steps and argparse choices
  • instructions.md — Documentation for the new override option
  • tests/test_test_both.py — 15 new tests covering all verdict paths, state transitions, schema, and mock
  • tests/test_megaplan.py, tests/test_schemas.py — Updated existing parametrized tests

Usage

megaplan override test-both --plan <name> --reason "critique loop stagnated"

Test plan

  • All 15 new tests pass
  • All 289 existing tests pass (274 original + 15 new)
  • test-both only available from EVALUATED state with ESCALATE/ABORT recommendation
  • All three verdict paths (approach_a, approach_b, synthesis) produce correct state transitions
  • History entry, override metadata, and artifacts written correctly
  • Manual test with real agents on a stagnated plan

🤖 Generated with Claude Code

When the critique loop stagnates (ESCALATE), the only options today are
add-note, force-proceed, or abort — all of which punt the decision to the
human without evidence. This adds a test-both override that invokes a judge
agent to evaluate the current plan against an alternative approach, then
renders a verdict (approach_a, approach_b, or synthesis) based on empirical
assessment.

Changes:
- New test-both.json schema for structured judge output
- Judge prompt in prompts.py that evaluates both approaches against
  unresolved flags
- _override_test_both handler in cli.py with full state machine integration
- Mock worker for test-both in workers.py
- Default agent routing (claude) in _core.py
- Updated infer_next_steps to surface test-both for ESCALATE/ABORT
- Documentation in instructions.md
- 15 new tests covering all verdict paths, state transitions, and schema

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@peteromallet
Copy link
Copy Markdown
Owner

Hey Clayton — nice idea, the problem you're targeting is real. Some thoughts before we go deeper:

v0.3.0 context

We just shipped v0.3.0 on main which changes how the orchestrator handles ESCALATE. The orchestrator now has:

  • Judgment-based continue/stop decisions — it reads the flags, checks the actual project code, and determines whether flags are real concerns or implementation noise
  • Authority to force-proceed with evidence — it can cite the numbers, explain per-flag reasoning, and act without asking
  • Rich evaluation signalsunresolved_flags with details, resolved_flags, loop_summary, idea, recurring_critiques

This covers the most common stagnation case: the critic keeps flagging things that are implementation details or phantom concerns. The orchestrator investigates and force-proceeds.

Where does the alternative come from?

The judge prompt asks the judge to both propose approach B and rule between A and B. That's a conflict of interest — it's judging its own proposal. The "empirical" framing implies independent evaluation, but it's really one agent arguing with itself.

Does stagnation actually need a whole new approach?

When the loop stagnates, it's usually because:

  1. Flags are noise → orchestrator judgment handles this now (v0.3.0)
  2. Flags are real but plan-unfixable → force-proceed, executor handles it
  3. The approach is fundamentally wrong → the user needs to redirect

Case 3 is the only one where an alternative plan would help. But generating a credible alternative in a single agent call is a lot to ask — it's doing in one step what the full clarify→plan→critique loop does in many.

Simpler alternatives that feed into the existing system

A few things that might solve the same problem with less machinery:

  • --fresh on critique: if the critic is stuck in a rut, a fresh session re-evaluates from scratch. The recurring critique might not recur.
  • Robustness adjustment: orchestrator drops from thorough to standard when it detects churning — fewer nitpick flags.
  • Orchestrator flag triage: already in v0.3.0 — the orchestrator reads each flag, checks the code, classifies as plan-level vs executor-level, and force-proceeds on the noise.
  • User redirect on ESCALATE: if the approach is truly wrong, asking the user is the right call. They have constraints the judge doesn't know about.

Not saying no

The core insight — "break deadlocks with evidence instead of just punting" — is good. I'm just wondering if we can get there by making the existing actors smarter (orchestrator judgment, fresh critic sessions, robustness tuning) rather than adding a new actor. What do you think?

The PR will also need a rebase against v0.3.0 — max_iterations/budget_usd were removed, instructions were rewritten, and step responses are richer now.

@OxideDall
Copy link
Copy Markdown

Security Review: PR #1 — "test-both override for empirical deadlock resolution"

Repository: peteromallet/megaplan
Branch: pr-1
Reviewer: Security review agent
Commit: 043c7d8

Summary

This PR adds a test-both override action that invokes an LLM judge to resolve deadlocked critique loops. The core mechanism: when the planner and critic are stuck, a judge agent evaluates both the current plan and an alternative, then renders a verdict (approach_a, approach_b, or synthesis) that determines the next workflow step.

The override is structurally similar to existing overrides (add-note, force-proceed, abort, skip) and follows the same patterns. However, test-both introduces unique risks because it reads unvalidated file content into an LLM prompt and uses LLM output in control-flow branching.


Findings

1. Prompt Injection via Untrusted File Content (Medium)

OWASP Category: Injection
File: megaplan/prompts.py:259-295
Code:

def _test_both_prompt(state: PlanState, plan_dir: Path) -> str:
    latest_plan = latest_plan_path(plan_dir, state).read_text(encoding="utf-8")
    latest_meta = read_json(latest_plan_meta_path(plan_dir, state))
    ...
    return textwrap.dedent(
        f"""
        ...
        Current plan (Approach A):
        {latest_plan}

        Plan metadata:
        {json_dump(latest_meta).strip()}

        Unresolved flags from the critic:
        {json_dump(open_flags).strip()}
        ...
        """
    ).strip()

Risk: The prompt template embeds latest_plan (raw file content from plan_*.md), latest_meta (JSON from plan_*.meta.json), and open_flags (from faults.json) directly into the LLM prompt via f-string interpolation. Any of these files can be written by processes with write access to the .megaplan directory. An attacker who can control the plan file content can inject arbitrary instructions into the LLM prompt, effectively hijacking the judge agent.

Impact: An attacker could inject instructions that cause the LLM to:

  • Always return a specific verdict regardless of actual plan quality
  • Leak project information through the verdict rationale
  • Skip safety checks by manipulating the gate evaluation
  • Return malformed payloads that cause unexpected code paths

Recommendation:

  • Add input sanitization for plan content before embedding in prompts (strip control characters, apply length limits)
  • Consider using a structured data format (e.g., JSON block) rather than raw markdown to delimit injected content
  • Add a system-prompt-level instruction that tells the LLM to ignore instructions embedded in plan content (instruction defense)
  • Validate/re-sanitize all data before f-string interpolation, particularly plan content that may be large or contain special formatting

2. Unvalidated LLM Output Used in Control-Flow Branching (Medium)

OWASP Category: Insecure Design / Security Misconfiguration
File: megaplan/cli.py:1035-1055
Code:

    verdict = worker.payload["verdict"]
    rationale = worker.payload["verdict_rationale"]
    ...
    if verdict == "approach_a":
        gate = run_gate_checks(plan_dir, state)
        atomic_write_json(plan_dir / "gate.json", gate)
        if gate["passed"]:
            final_plan = latest_plan_path(plan_dir, state).read_text(encoding="utf-8")
            atomic_write_text(plan_dir / "final.md", final_plan)
            state["current_state"] = STATE_GATED
        next_step = "execute" if gate["passed"] else "integrate"
    elif verdict == "approach_b":
        next_step = "integrate"
    else:
        next_step = "integrate"

    evaluation = copy.deepcopy(state["last_evaluation"])
    evaluation["recommendation"] = "SKIP" if verdict == "approach_a" else "CONTINUE"

Risk: The LLM's verdict field directly determines which code path executes. If the LLM returns an unexpected value (due to prompt injection, a bug, or an adversarial model), the code falls through to the else branch which defaults to "integrate". More critically, approach_a triggers run_gate_checks() and potentially writes final.md and transitions state to STATE_GATED — advancing the workflow to execution, which may write code to disk. The evaluation["recommendation"] mutation also depends on this unchecked value.

The payload is validated against the JSON schema only after the entire flow, but the schema only checks structure, not semantic correctness. There is no bounds-checking or sanity-checking on the verdict value before it drives control flow.

Impact: An attacker who can manipulate LLM output (via the prompt injection vector described above) could:

  • Force approach_a to advance an unready plan to gate/execution
  • Cause state corruption by triggering unexpected state transitions
  • Cause final.md to be written with suboptimal plan content

Recommendation:

  • Add explicit enum validation before the if/elif chain:
    valid_verdicts = {"approach_a", "approach_b", "synthesis"}
    if verdict not in valid_verdicts:
        raise CliError("invalid_verdict", f"Unexpected verdict: {verdict!r}")
  • Consider adding a human-in-the-loop confirmation when approach_a causes a gate transition
  • Log the full LLM payload to the audit trail for forensic analysis

3. Hidden Root Bypass via Private Attribute (Low)

OWASP Category: Broken Access Control
File: megaplan/cli.py:1026
Code:

    root = args._test_both_root if hasattr(args, "_test_both_root") else Path.cwd()

Risk: _test_both_root is a non-public attribute that allows callers to bypass the default working directory. While not exposed through the CLI parser (argparse has no _test_both_root argument), any code that can set attributes on the Namespace object can override the root directory. This is used in tests but could be exploited if an attacker gains the ability to call handle_override with a crafted Namespace.

Existing overrides (_override_add_note, _override_abort, _override_force_proceed, _override_skip) do NOT have this parameter — this is unique to test-both.

Impact: Low — requires code-level access. However, the asymmetry with other override functions makes this an inconsistency worth noting.

Recommendation:

  • Apply the same pattern: if _test_both_root is needed for testing, use dependency injection via monkeypatching in tests instead of a backdoor parameter, OR make all override functions consistently accept the same parameters.
  • If retained, document that this is test-only and should not be exposed to production call paths.

4. Unvalidated LLM Output Written to Disk (Low)

OWASP Category: Security Misconfiguration
File: megaplan/cli.py:1033
Code:

    atomic_write_json(plan_dir / test_both_filename, worker.payload)

Risk: The full worker payload (LLM output) is written to test-both.json before schema validation. The validate_payload call in run_step_with_worker only validates the payload fields defined as required in the schema — it does not strip or sanitize unexpected fields or content. If the LLM inserts malicious or oversized content into any field, it gets persisted to disk.

Other overrides also write LLM output to disk, so this is consistent with the existing codebase. But test-both is unique in that it triggers additional workflow steps (gate checks, final plan writing) based on the verdict.

Impact: Low — file is in .megaplan directory, not served to users. But if the content is later displayed in a web UI without escaping, it could become an XSS vector.

Recommendation:

  • Validate the payload against the schema BEFORE writing to disk
  • Apply output size limits for LLM-generated content
  • Strip unexpected keys from the payload before persistence

5. TOCTOU Race Condition in State Management (Low)

OWASP Category: Security Misconfiguration
File: megaplan/cli.py:1022-1085
Code:

    # Line 1022: state is read before function call (in caller)
    # Line 1027: long-running LLM call via run_step_with_worker
    worker, agent, mode, refreshed = run_step_with_worker("test-both", state, plan_dir, args, root=root)
    # Line 1078: state is saved after LLM returns
    save_state(plan_dir, state)

Risk: The state is read from disk before the LLM call (which can take 30+ seconds) and written back after. Between read and write, another process or concurrent invocation could modify the state file, and those changes would be silently overwritten. The save_state function uses atomic_write_json so it avoids corrupting the file, but it does not detect or prevent lost updates.

This is a pre-existing pattern in all override functions (not unique to test-both), but test-both introduces additional state mutations beyond other overrides:

  • It mutates state["current_state"] (to STATE_GATED)
  • It mutates state["last_evaluation"]["recommendation"]
  • It runs gate checks and writes gate.json

Impact: Low — concurrent access to the same plan directory is unlikely in the normal workflow. In CI/CD pipelines where plans run sequentially, this is not exploitable.

Recommendation:

  • For a defense-in-depth improvement, add a file-locking mechanism around state file access
  • Or use a compare-and-swap pattern: read the state file version/modification time before and after the LLM call, and abort if it changed

6. Injection Surface via project_dir in Prompt (Info)

OWASP Category: Injection
File: megaplan/prompts.py:251,270
Code:

    project_dir = Path(state["config"]["project_dir"])
    ...
    Project directory:
    {project_dir}

Risk: project_dir is read from state config and embedded directly in the prompt. If the config can be tampered with (e.g., via a malicious .megaplan/config.json), an attacker could inject a path that contains prompt-breaking characters like " or newlines.

Impact: Info — project_dir is set during project initialization and less likely to be attacker-controlled than plan content.

Recommendation:

  • Validate/sanitize project_dir before embedding in the prompt
  • Use str(project_dir.resolve()) to get the absolute path and prevent directory traversal

7. Missing Input Validation for reason Field (Info)

OWASP Category: Security Misconfiguration
File: megaplan/cli.py:1070
Code:

    _append_to_meta(state, "overrides", {
        "action": "test-both",
        "timestamp": now_utc(),
        "verdict": verdict,
        "rationale": rationale,
        "reason": args.reason,
    })

Risk: args.reason is stored directly into state metadata without sanitization. While this is a CLI tool and the data stays in local JSON, if this data is ever displayed in a platform UI, it could carry injection payloads (XSS, etc.).

Impact: Info — no current exposure. The existing override functions also store args.reason unsanitized.

Recommendation:

  • Strip or escape control characters and HTML-sensitive characters from the reason string before persistence, as a defense-in-depth measure.

Summary Table

# Severity Category Finding File
1 Medium Injection Prompt injection via unvalidated plan content embedded in LLM prompt prompts.py:259-295
2 Medium Insecure Design LLM verdict field drives control-flow branching without validation cli.py:1035-1055
3 Low Access Control Hidden _test_both_root parameter bypasses default root directory cli.py:1026
4 Low Security Misconfiguration LLM output written to disk before schema validation cli.py:1033
5 Low Security Misconfiguration TOCTOU race window in state management during long LLM call cli.py:1022-1085
6 Info Injection project_dir embedded unsanitized in LLM prompt prompts.py:270
7 Info Security Misconfiguration args.reason stored unsanitized in metadata cli.py:1070

Verdict

PASS with recommendations. The PR adds a well-structured feature that follows existing code patterns. No critical vulnerabilities were found. The two Medium-severity findings (prompt injection and unvalidated verdict control flow) are the main concerns and should be addressed before deployment to production environments where the plan files may contain untrusted content.

Recommended action items:

  1. Add input sanitization to prompt interpolation in _test_both_prompt
  2. Add enum validation for the verdict field before control-flow branching
  3. Validate payload against schema before writing to disk

@peteromallet
Copy link
Copy Markdown
Owner

Closing — stale, conflicting with main; test-both override approach not pursued.

@OxideDall
Copy link
Copy Markdown

Security Review: PR #1 - Test-Both Override for Deadlock Resolution

Executive Summary

This PR introduces a new test-both override mechanism for resolving critique loop deadlocks. The feature allows an AI judge to evaluate the current plan against an alternative approach when the critique loop stagnates. While the implementation includes some security controls, several critical vulnerabilities were identified that could lead to path traversal, unauthorized schema loading, race conditions, and bypass of state machine protections.

Overall Risk: HIGH


Findings Summary

Severity Count Categories
Critical 1 Path Traversal
High 2 Broken Access Control, Race Condition
Medium 2 Security Misconfiguration, Insecure Deserialization
Low 1 Insufficient Logging
Info 1 Code Quality

Critical Vulnerabilities

1. Path Traversal via _test_both_root Parameter

OWASP Category: Injection / Broken Access Control
Severity: Critical
Location: megaplan/cli.py:1044

Vulnerable Code:

root = args._test_both_root if hasattr(args, "_test_both_root") else Path.cwd()
try:
    worker, agent, mode, refreshed = run_step_with_worker("test-both", state, plan_dir, args, root=root)

Risk:
The _test_both_root attribute is not defined in the argument parser (line 1124: override_parser.add_argument("override_action", choices=[...])). However, the code checks for its existence using hasattr() and uses it directly if present. This creates a path traversal vulnerability where an attacker can inject arbitrary paths.

The root parameter is used to construct the schema path:

  • schemas_root(root)root / ".megaplan" / "schemas"
  • In run_claude_step() (line 389): schema_text = json.dumps(read_json(schemas_root(root) / schema_name))

Attack Vector:

  1. An attacker with access to the Python environment or test code can set args._test_both_root = Path("/attacker/controlled/dir")
  2. The system will load schemas from /attacker/controlled/dir/.megaplan/schemas/test-both.json
  3. This allows loading malicious JSON schemas that could:
    • Bypass validation constraints
    • Exfiltrate data through schema validation error messages
    • Cause denial of service via deeply nested structures

Evidence:
The test file tests/test_test_both.py:492 explicitly demonstrates this capability:

args._test_both_root = plan_fixture["root"]

Impact:

  • Arbitrary file read via schema loading
  • Bypass of schema validation controls
  • Potential for JSON deserialization attacks
  • Configuration bypass

Recommendation:

  1. NEVER trust undeclared argument attributes. Remove the hasattr() pattern.
  2. Add _test_both_root to the argument parser with proper validation:
    override_parser.add_argument("--test-both-root", type=Path, default=None)
  3. Validate that the root path is within expected boundaries:
    if args.test_both_root:
        args.test_both_root = args.test_both_root.resolve()
        if not is_safe_path(args.test_both_root, allowed_base_dirs):
            raise CliError("invalid_path", "test-both-root must be within allowed directories")
    root = args.test_both_root or Path.cwd()
  4. Implement path canonicalization and restrict to project directories only.

High Severity Vulnerabilities

2. State Machine Bypass via Race Condition

OWASP Category: Broken Access Control
Severity: High
Location: megaplan/cli.py:1030-1087

Vulnerable Code:

recommendation = state["last_evaluation"].get("recommendation")  # Line 1037
if recommendation not in {"ESCALATE", "ABORT"}:
    raise CliError(...)
# ... worker execution happens here (lines 1045-1049) ...
# State mutation happens after worker execution:
evaluation = copy.deepcopy(state["last_evaluation"])  # Line 1085
evaluation["recommendation"] = "SKIP" if verdict == "approach_a" else "CONTINUE"
state["last_evaluation"] = evaluation  # Line 1087

Risk:
This is a Time-of-Check Time-of-Use (TOCTOU) vulnerability. The code validates the recommendation at line 1037, but the actual state mutation happens after the worker execution completes (lines 1085-1087). Between these two points:

  1. Worker execution happens (potentially taking minutes/hours)
  2. The state.json file could be modified by:
    • Another concurrent process
    • Manual file editing
    • Another megaplan command in a separate terminal
  3. The final state mutation overwrites without re-checking

Attack Vector:

  1. Attacker initiates megaplan override test-both with ESCALATE state
  2. While the AI worker is executing (long-running), attacker modifies state.json to change the recommendation
  3. When worker completes, the code blindly mutates state without re-validating current conditions
  4. Result: State machine transitions that should be impossible

Evidence:
The state is loaded once at the beginning (load_plan()) but never re-validated before mutation. The save_state() call doesn't implement optimistic locking or version checking.

Impact:

  • Bypass of state machine integrity checks
  • Unauthorized state transitions
  • Corruption of plan state leading to undefined behavior
  • Potential for privilege escalation (forcing plan execution when it should be aborted)

Recommendation:

  1. Implement optimistic locking with version numbers in state.json:
    # Before mutation:
    current_state = load_state(plan_dir)
    if current_state["version"] != state["version"]:
        raise CliError("concurrent_modification", "State was modified during execution")
    state["version"] += 1
    save_state(plan_dir, state)
  2. Re-validate all preconditions immediately before state mutation
  3. Use file-based locking (fcntl.flock) during critical sections
  4. Add transaction boundaries around state read-modify-write cycles

3. Evaluation Mutation Without Audit Trail

OWASP Category: Insufficient Logging & Monitoring
Severity: High
Location: megaplan/cli.py:1085-1087

Vulnerable Code:

evaluation = copy.deepcopy(state["last_evaluation"])
evaluation["recommendation"] = "SKIP" if verdict == "approach_a" else "CONTINUE"
state["last_evaluation"] = evaluation

Risk:
The code mutates the evaluation recommendation (a critical security decision) without recording:

  • What the original recommendation was
  • Why it was changed
  • Who/what authorized the change
  • Timestamp of the change (only the override action gets a timestamp in meta.overrides)

This creates an audit gap where the original evaluation result is lost forever.

Attack Vector:

  1. Malicious insider or compromised process runs test-both
  2. Original evaluation shows "ABORT" with critical security concerns
  3. Test-both changes recommendation to "SKIP", erasing the abort rationale
  4. Later forensic analysis cannot determine why the critical concerns were ignored
  5. Vulnerable code proceeds to production

Evidence:
Compare to _override_skip() which also mutates evaluation but at least the original evaluation is preserved in history. However, test-both replaces the evaluation in-place.

Impact:

  • Loss of security audit trail
  • Inability to forensically analyze why risky code was approved
  • Compliance violations (no evidence of who approved security exceptions)
  • Insider threat enablement

Recommendation:

  1. Never mutate last_evaluation in-place. Create a new evaluation entry:
    original_eval = state["last_evaluation"]
    new_evaluation = {
        "original_recommendation": original_eval["recommendation"],
        "recommendation": "SKIP" if verdict == "approach_a" else "CONTINUE",
        "override_source": "test-both",
        "verdict": verdict,
        "timestamp": now_utc(),
        "original_evaluation_ref": original_eval
    }
    state["last_evaluation"] = new_evaluation
  2. Preserve the original evaluation in history or a separate audit log
  3. Add mandatory --reason validation (currently defaults to empty string)
  4. Log all state transitions to an append-only audit log

Medium Severity Vulnerabilities

4. Unsafe JSON Deserialization from Worker Output

OWASP Category: Insecure Deserialization
Severity: Medium
Location: megaplan/cli.py:1051-1053

Vulnerable Code:

atomic_write_json(plan_dir / test_both_filename, worker.payload)
verdict = worker.payload["verdict"]
rationale = worker.payload["verdict_rationale"]

Risk:
The code directly trusts the worker.payload returned from AI worker execution. While there is schema validation in validate_payload(), the validation happens in the worker execution path (workers.py:404-406), and the validation is based on JSON Schema which has known limitations:

  1. String length limits may not be enforced (DoS via huge strings)
  2. Nested object depth not validated (DoS via deep recursion)
  3. Additional properties may be allowed depending on schema strictness
  4. The schema for test-both.json doesn't include maxLength constraints

Attack Vector:

  1. Attacker compromises AI worker output (e.g., via prompt injection)
  2. Worker returns payload with:
    • Extremely long verdict_rationale (millions of characters)
    • Deeply nested synthesis_description
    • Malicious field names in additional properties
  3. atomic_write_json() writes massive file causing disk exhaustion
  4. String operations cause memory exhaustion

Evidence:
Schema at megaplan/schemas.py:254-286 has no maxLength constraints:

"verdict_rationale": {"type": "string"},
"synthesis_description": {"type": "string"},

Impact:

  • Denial of Service via resource exhaustion
  • Disk space exhaustion
  • Memory exhaustion during JSON serialization
  • Potential for log injection if rationale is logged

Recommendation:

  1. Add strict length limits to schema:
    "verdict_rationale": {"type": "string", "maxLength": 5000},
    "synthesis_description": {"type": "string", "maxLength": 10000},
  2. Validate string lengths before writing:
    MAX_RATIONALE_LENGTH = 5000
    if len(rationale) > MAX_RATIONALE_LENGTH:
        raise CliError("payload_too_large", f"verdict_rationale exceeds {MAX_RATIONALE_LENGTH} chars")
  3. Sanitize strings before logging (escape control characters, newlines)
  4. Implement maximum file size checks in atomic_write_json()

5. Hardcoded Agent Routing in State Mutation

OWASP Category: Security Misconfiguration
Severity: Medium
Location: megaplan/_core.py:213

Vulnerable Code:

DEFAULT_AGENT_ROUTING: dict[str, str] = {
    "integrate": "claude",
    "execute": "codex",
    "review": "codex",
    "test-both": "claude",  # <- New entry
}

Risk:
The agent routing is hardcoded rather than being part of the validated configuration. This creates several issues:

  1. No runtime validation that the agent is appropriate for the task
  2. No isolation between agents - a compromised route affects all users
  3. No audit of which agent was selected before execution
  4. The test-both feature runs with the same agent routing as regular operations despite being a privileged override

Attack Vector:

  1. If an attacker can modify DEFAULT_AGENT_ROUTING (e.g., via monkey patching, environment manipulation, or module replacement):
    • Route security-critical test-both operations to a compromised/malicious agent
    • Route all operations through a single agent to enable credential harvesting
  2. No defense-in-depth: if claude agent is compromised, all test-both verdicts are compromised

Evidence:
The routing is defined in _core.py (shared module) rather than per-project configuration. Tests can override agents via args.agent but there's no validation that overrides are authorized.

Impact:

  • Reduced defense-in-depth
  • Tampering with agent selection
  • Credential exposure to wrong agent
  • Audit trail gaps (which agent was authorized vs. which actually ran?)

Recommendation:

  1. Move agent routing to per-project configuration with signature verification
  2. Require explicit authorization for agent overrides:
    def resolve_agent(step, args, state):
        configured_agent = state["config"]["agent_routing"].get(step)
        if args.agent and args.agent != configured_agent:
            if not state["config"].get("allow_agent_override"):
                raise CliError("unauthorized", "Agent override not permitted")
            log_audit_event("agent_override", step, configured_agent, args.agent)
        return args.agent or configured_agent
  3. Log all agent selections to audit trail
  4. Consider separate agent pools for override operations (defense-in-depth)

Low Severity Vulnerabilities

6. Missing Authorization Check for Override Operations

OWASP Category: Broken Access Control
Severity: Low
Location: megaplan/cli.py:1030

Vulnerable Code:

def _override_test_both(plan_dir: Path, state: PlanState, args: argparse.Namespace) -> StepResponse:
    if state["current_state"] != STATE_EVALUATED:
        raise CliError(...)
    recommendation = state["last_evaluation"].get("recommendation")
    if recommendation not in {"ESCALATE", "ABORT"}:
        raise CliError(...)
    # No authorization check - anyone who can run the command can override

Risk:
There is no authorization check to determine if the user/process invoking test-both override is authorized to bypass the evaluation system. Any user/process that can execute the megaplan CLI can:

  • Override critical ABORT evaluations
  • Force plan execution despite security concerns
  • Bypass robustness checks

Attack Vector:

  1. Junior developer or CI system invokes megaplan override test-both
  2. No check whether they have permission to override security evaluations
  3. Critical security concerns are bypassed
  4. Vulnerable code proceeds to gate/execute stages

Evidence:
Other override operations (force-proceed, skip) also lack authorization checks. The only protection is file system permissions on the plan directory.

Impact:

  • Unauthorized bypass of security controls
  • Insider threat: any user can override safety mechanisms
  • No separation of duties (same person who wrote code can override their own evaluation)

Recommendation:

  1. Implement role-based access control:
    def require_override_permission(state, override_type):
        current_user = get_current_user()
        allowed_users = state["config"].get("override_authorized_users", [])
        if current_user not in allowed_users:
            raise CliError("forbidden", f"User {current_user} not authorized for {override_type} override")
  2. Require multi-party approval for critical overrides (e.g., ABORT → proceed)
  3. Add --confirm flag for destructive operations
  4. Log all override attempts (authorized and unauthorized) to audit log

Informational Findings

7. Code Quality: Inconsistent Error Handling

Severity: Info
Location: megaplan/cli.py:1047-1049

Code:

try:
    worker, agent, mode, refreshed = run_step_with_worker("test-both", state, plan_dir, args, root=root)
except CliError as error:
    record_step_failure(plan_dir, state, step="test-both", iteration=state["iteration"], error=error)
    raise

Observation:
Good practice to record step failures, but:

  1. Only catches CliError, not other exceptions (ValueError, OSError, etc.)
  2. State may be corrupted if record_step_failure() itself fails
  3. No cleanup of partial artifacts

Recommendation:

try:
    worker, agent, mode, refreshed = run_step_with_worker("test-both", state, plan_dir, args, root=root)
except Exception as error:
    try:
        record_step_failure(plan_dir, state, step="test-both", iteration=state["iteration"], error=error)
    except Exception:
        pass  # Don't mask original error
    if isinstance(error, CliError):
        raise
    raise CliError("worker_error", f"test-both failed: {error}") from error

No Vulnerabilities Found (Clean Areas)

Command Injection: Worker execution uses proper subprocess argument lists (no shell=True)
SQL Injection: No database operations
XSS: No web interfaces
XXE: No XML parsing
Known Vulnerabilities: No new dependencies added


Summary Table

Finding OWASP Category Severity Fix Complexity
Path Traversal via _test_both_root Injection / Access Control Critical Medium
TOCTOU Race Condition Broken Access Control High High
Evaluation Mutation Without Audit Insufficient Logging High Low
Unsafe JSON Deserialization Insecure Deserialization Medium Low
Hardcoded Agent Routing Security Misconfiguration Medium Medium
Missing Authorization Check Broken Access Control Low Medium
Inconsistent Error Handling N/A Info Low

Recommendations Priority

  1. IMMEDIATE (Critical): Fix path traversal by removing hasattr() pattern and adding proper argument validation
  2. HIGH PRIORITY: Implement optimistic locking to prevent TOCTOU race conditions
  3. HIGH PRIORITY: Preserve original evaluation in audit trail before mutation
  4. MEDIUM PRIORITY: Add maxLength constraints to schema and validate payload sizes
  5. MEDIUM PRIORITY: Move agent routing to configuration with validation
  6. LOW PRIORITY: Add authorization checks for override operations

Conclusion

The test-both override mechanism introduces significant security risks primarily around:

  1. Path traversal enabling malicious schema loading
  2. Race conditions allowing state machine bypass
  3. Audit trail gaps hiding security decisions

While the feature serves a legitimate purpose (breaking deadlocks), the implementation requires hardening before production use. The attack surface is especially concerning because this is an override mechanism designed to bypass normal safety checks - making it a high-value target for attackers.

Recommendation: Do not merge until Critical and High severity issues are resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants